Churn prediction in telecom using Random Forest and PSO based data balancing in combination with various feature selection strategies
Authors
Abstract
The telecommunication industry faces fierce competition to retain customers, and therefore requires an efficient churn prediction model to monitor customer churn. The enormous size, high dimensionality and imbalanced nature of telecommunication datasets are the main hurdles in attaining the desired performance for churn prediction. In this study, we investigate the significance of a Particle Swarm Optimization (PSO) based undersampling method for handling the imbalanced data distribution, in combination with different feature reduction techniques such as Principal Component Analysis (PCA), Fisher’s ratio, F-score and Minimum Redundancy and Maximum Relevance (mRMR). Random Forest (RF) and K Nearest Neighbour (KNN) classifiers are employed to evaluate performance on the optimally sampled, feature-reduced dataset. Performance is evaluated using sensitivity, specificity and Area Under the Curve (AUC) based measures. Finally, simulations show that our proposed approach based on PSO, mRMR and RF, termed Chr-PmRF, performs quite well at predicting churners and can therefore be beneficial for the highly competitive telecommunication industry.
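The three evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold and the toy labels and scores below are all assumptions made for illustration. AUC is computed here through its pairwise-ranking interpretation, that is, the fraction of (churner, non-churner) pairs in which the churner receives the higher score.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = churner."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # loyal customers kept
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 churners and 5 non-churners (imbalanced on purpose).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f}")
```

Reporting sensitivity and specificity separately matters on imbalanced data, because a classifier that labels every customer as a non-churner can reach high accuracy while its sensitivity is zero.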
Similar resources
A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)
Machine learning-based classification techniques provide support for the decision making process in the field of healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous in nature, and their high dimensionality results in a slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...
Application of Feature Extraction Method in Customer Churn Prediction Based on Random Forest and Transduction
With the development of telecom business, customer churn prediction becomes more and more important. An outstanding issue in customer churn prediction is high dimensional problem. Curse of dimensionality will easily occur if effective feature extraction is not applied during modeling. Among the most popular feature extraction approaches, principal component analysis (PCA) method based on induct...
Predicting credit card customer churn in banks using data mining
In this paper, we solve the customer credit card churn prediction via data mining. We developed an ensemble system incorporating majority voting and involving Multilayer Perceptron (MLP), Logistic Regression (LR), decision trees (J48), Random Forest (RF), Radial Basis Function (RBF) network and Support Vector Machine (SVM) as the constituents. The dataset was taken from the Business Intelligenc...
Dimensionality and data reduction in telecom churn prediction
Purpose – Churn prediction is a very important task for successful customer relationship management. In general, churn prediction can be achieved by many data mining techniques. However, during data mining, dimensionality reduction (or feature selection) and data reduction are the two important data preprocessing steps. In particular, the aims of feature selection and data reduction are to filt...
Neighborhood Cleaning Rules and Particle Swarm Optimization for Predicting Customer Churn Behavior in Telecom Industry
Churn prediction is an important task for Customer Relationship Management (CRM) in telecommunication companies. Accurate churn prediction helps CRM in planning effective strategies to retain their valuable customers. However, churn prediction is a complex and challenging task. In this paper, a hybrid churn prediction model is proposed based on combining two approaches; Neighborhood Cleaning Ru...
Journal title:
- Computers & Electrical Engineering
Volume 38, Issue
Pages -
Publication date: 2012